Playout policy adaptation with move features

Author

Tristan Cazenave

Abstract

Monte Carlo Tree Search (MCTS) is the state-of-the-art algorithm for General Game Playing (GGP). We propose to learn a playout policy online so as to improve MCTS for GGP. We also propose to learn a policy based not only on the moves themselves but also on the features of the moves. We test the resulting algorithms, named Playout Policy Adaptation (PPA) and Playout Policy Adaptation with move Features (PPAF), on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Knightthrough, Misere Knightthrough and Nogo. The experiments compare PPA and PPAF to Upper Confidence for Trees (UCT) and to the closely related Move-Average Sampling Technique (MAST) algorithm.


Similar resources

Memorizing the Playout Policy

Monte Carlo Tree Search (MCTS) is the state of the art algorithm for General Game Playing (GGP). Playout Policy Adaptation with move Features (PPAF) is a state of the art MCTS algorithm that learns a playout policy online. We propose a simple modification to PPAF consisting in memorizing the learned policy from one move to the next. We test PPAF with memorization (PPAFM) against PPAF and UCT fo...


Playout Policy Adaptation for Games

Monte Carlo Tree Search (MCTS) is the state of the art algorithm for General Game Playing (GGP). We propose to learn a playout policy online so as to improve MCTS for GGP. We test the resulting algorithm named Playout Policy Adaptation (PPA) on Atarigo, Breakthrough, Misere Breakthrough, Domineering, Misere Domineering, Go, Knightthrough, Misere Knightthrough, Nogo and Misere Nogo. For most of ...


Optimization of a packet video receiver under different levels of delay jitter: an analytical approach

This paper studies the problem of analyzing and designing optimal playout adaptation policies for packet video receivers (PVRs) that operate in a delay jitter inducing best-effort network, like the current Internet. The developed system model is built around the Ek/Di/1/N phase-type queue and allows for the effective modeling of key design and system parameters, such as: the level of delay jitt...


Nested Rollout Policy Adaptation with Selective Policies

Monte Carlo Tree Search (MCTS) is a general search algorithm that has improved the state of the art for multiple games and optimization problems. Nested Rollout Policy Adaptation (NRPA) is an MCTS variant that has found record-breaking solutions for puzzles and optimization problems. It learns a playout policy online that dynamically adapts the playouts to the problem at hand. We propose to enh...


Joint Power/Playout Control Schemes for Media Streaming over Wireless Links

We investigate transmission and playout policies for streaming media over a wireless link. In particular, we choose both the power at the transmitter and the playout rate at the receiver, in order to minimize the power consumption and maximize the media quality. We formulate the problem using a dynamic programming approach, study the structural properties of the optimal solution, develop justif...



Journal: Theor. Comput. Sci.

Volume 644

Publication date: 2016
